Abstract. Cellular automata (CA) consist of an array of identical cells,
each of which may take one of a finite number of possible states. The
entire array evolves in discrete time steps by iterating a global evolution
G. Further, this global evolution G is required to be shift-invariant (it
acts the same everywhere) and causal (information cannot be transmitted
faster than some fixed number of cells per time step). At least in the
classical [13], reversible [17] and quantum cases [1], these two top-down
axiomatic conditions are sufficient to entail more bottom-up, operational
descriptions of G. We investigate whether the same is true in the
probabilistic case.

Keywords: Characterization, noise, Markov process, stochastic Einstein
locality, screening-off, common cause principle, non-signalling,
multi-party non-local box.

1 Introduction 

Due to its built-in symmetries, CA constitute a clearly physics-like model of 
computation [18]. They model spatially distributed computation in space as we 
know it [25,19], and therefore they constitute a framework for studying and 
proving properties about such systems - by far the most established. Conceived 
amongst theoretical physicists such as Ulam and Von Neumann [29], CA were 
soon considered as a possible model for particle physics, fluids, and the related 
differential equations. There are numerous results on this approach, often under 
the name of Lattice-gas cellular automata [33, 32, 9, 24]. More generally, CA are 
one of the main theoretical tools of 'complex sciences', where one studies the 
emergence of complex, global behaviours as arising from simple local
interactions. There, CA have proved useful for modelling an incredible
variety of things, ranging from traffic jams [20] to demographics and
regional development or consumption [6, 31].

Each of these contexts brings its own set of reasons to study probabilistic
extensions of CA. This trend of work has already started, and generated
fascinating questions. For example, are there CA which can defend themselves
against an everywhere-present noise? Or is it the case that any initial
configuration will ultimately be entirely erased, i.e. that such CA have only
one limit distribution? In [26, 27] it was shown, via an extremely convoluted
argument which many would like to simplify [23, 7, 12], that there exist CA
which are resistant to this noise model. However, addressing these issues
would perhaps be easier given a robust definition of Probabilistic CA (PCA).
All of these papers begin with a definition of PCA, but these definitions are
all variations of the same concept. Namely, these are stochastic maps having
the property that they decompose in two phases: first, a classical CA is
applied, and second, a model of noise is applied (a stochastic matrix is
applied homogeneously on each individual cell). We refer to this class of
stochastic maps as Standard-PCA and now explain its several drawbacks.

Fig. 1. (a) Parity. Time flows upwards. Wires are bits. Boxes are (randomized)
gates, i.e. stochastic matrices. Boxes marked with a circle are coin tosses.
Boxes marked $\oplus$ perform addition modulo two. Input cells $\ldots n$ are
ignored. Output random variables $(X_i)_{i=1 \ldots n}$ have the property that
for any strict subset $I$ of $1 \ldots n$, $X_I$ is uniformly distributed, yet
their global parity is always 0. (b) As such Parity is a localized stochastic
map, but it can be modified into a translation-invariant stochastic map over
entire configurations of the form $\ldots qq01001qq \ldots$, by ensuring that
for any $x \in \{q, 0, 1\}$, $qx \mapsto x$, $xq \mapsto x$, $qq \mapsto q$,
and hence making the coin tosses conditional upon their input cells not being
$q$. Parity does not belong to the class of Standard-PCA, but it ought to be
considered a valid PCA.
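The two-phase structure of Standard-PCA can be made concrete with a small sketch. Everything specific here is an assumption chosen for illustration and does not come from the papers cited: a ring of four binary cells (periodic boundary), a local rule that XORs each cell with its left neighbour, and a bit-flip noise of strength 1/10. The sketch computes the exact output law for a fully determined input and checks that distinct output cells are independent, which is the feature that prevents Standard-PCA from generating correlations in one step.

```python
from fractions import Fraction
from itertools import product

def xor_rule(left, centre):
    # Hypothetical deterministic local rule with neighbourhood (i-1, i).
    return left ^ centre

def bit_flip(eps):
    # noise[s][o] = probability of reading o when the noiseless cell is s.
    return {0: {0: 1 - eps, 1: eps}, 1: {0: eps, 1: 1 - eps}}

def standard_pca_law(config, rule, noise):
    """Exact output law of one Standard-PCA step on a ring of binary cells."""
    n = len(config)
    # Phase 1: apply the classical CA (config[-1] wraps around the ring).
    after_ca = [rule(config[i - 1], config[i]) for i in range(n)]
    # Phase 2: apply the same noise matrix independently to every cell.
    law = {}
    for out in product((0, 1), repeat=n):
        p = Fraction(1)
        for s, o in zip(after_ca, out):
            p *= noise[s][o]
        law[out] = p
    return law

def marginal(law, cells):
    """Marginal law of the listed output cells."""
    m = {}
    for out, p in law.items():
        key = tuple(out[i] for i in cells)
        m[key] = m.get(key, 0) + p
    return m

law = standard_pca_law((1, 0, 1, 1), xor_rule, bit_flip(Fraction(1, 10)))
m0, m2, m02 = marginal(law, [0]), marginal(law, [2]), marginal(law, [0, 2])
cells_independent = all(
    m02[(a, b)] == m0[(a,)] * m2[(b,)] for a in (0, 1) for b in (0, 1)
)
```

Because the noise acts cell by cell, the output law for a deterministic input is always a product of per-cell laws, whatever the rule and noise matrix.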

First, this class is incomplete. There are many stochastic maps which ought
to be called PCA but are not Standard-PCA, not even with increased cell or
neighbourhood sizes. Parity is a concrete example of this, see Figure 1.
Indeed, Parity is translation-invariant and implementable by local means, but
it may generate statistical correlations between any two adjacent regions of
cells, which is never the case for Standard-PCA. Of course Parity can be
implemented as the square (i.e. two time-steps) of a Standard-PCA, which
points to a second, even worse problem with this class: the composition of
two Standard-PCA is not necessarily a Standard-PCA. This is counter-intuitive
for a notion of PCA. Third, the intuition behind this class is simple but
ad-hoc, i.e. not founded upon some meaningful high-level principles.
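The properties claimed for Parity can be checked by brute force on a small instance. The wiring below is an assumption (one plausible local implementation, not necessarily the circuit of Figure 1): a fair coin $r_i$ is tossed between each pair of neighbouring cells, and cell $i$ outputs $r_{i-1} \oplus r_i$, with $r_0 = r_n = 0$ at the two ends.

```python
from fractions import Fraction
from itertools import product

def parity_law(n):
    """Exact law of (X_1..X_n) with X_i = r_{i-1} XOR r_i and r_0 = r_n = 0."""
    law = {}
    weight = Fraction(1, 2 ** (n - 1))          # each coin outcome is equally likely
    for coins in product((0, 1), repeat=n - 1):
        r = (0,) + coins + (0,)                 # pad with the fixed end values
        out = tuple(r[i] ^ r[i + 1] for i in range(n))
        law[out] = law.get(out, 0) + weight
    return law

def marginal(law, cells):
    """Marginal law of the listed output cells (0-based positions)."""
    m = {}
    for out, p in law.items():
        key = tuple(out[i] for i in cells)
        m[key] = m.get(key, 0) + p
    return m

law = parity_law(4)
always_even = all(sum(out) % 2 == 0 for out in law)
# The parity of the left half always equals that of the right half:
halves_correlated = all((out[0] ^ out[1]) == (out[2] ^ out[3]) for out in law)
```

Every strict subset of outputs is uniform, yet the two halves of the array are perfectly correlated through their parities. No Standard-PCA can produce this in a single step from a fully determined input.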

This paper aims to define PCA in a more robust manner, which is a long-standing 
problem. Criteria for a good axiomatic definition include: being composable, and 
being based on high-level principles, while entailing an operational description 
(i.e. implementation by local mechanisms). 

Why not start right-away from the latter, operational description? If not en- 
tailed by well-agreed principles, the operational description may be incomplete 
and ad-hoc, as with Standard-PCA. Moreover, axiomatic definitions tend to have 
a practical interest as simple characterizations of the operational descriptions. 
In fact a great many foundational mathematical results work that way. When
an axiomatic definition is given an operational description, we speak of a
'structure theorem' or a 'representation theorem' (e.g. spectral
decomposition of a unitary operator as
$\sum_j e^{i\lambda_j} |\psi_j\rangle\langle\psi_j|$). When an operational
description is given an axiomatic definition, we talk of a 'characterization'
(e.g. Hedlund's characterization of a CA as the set of shift-invariant
continuous functions in the Cantor topology).

Why not generalize Hedlund's characterization of CA to PCA? We will begin 
by investigating precisely that route and show that continuity arguments fail to 
characterize PCA (Section 3), as they do not forbid spontaneous generation of 
statistical correlation in arbitrarily distant places (Section 4) . Counter-examples 
are necessarily non-signalling, non-local stochastic maps. That such objects can 
exist is now a well-known fact in quantum theory, where non-signalling non- 
locality arises from entanglement in Bell's test experiment [5]. Recently their 
study has been abstracted away from quantum theory, for instance via the rather 
topical NLBox [4,21,3]. 

This points out the weakness of the non-signalling condition in the proba- 
bilistic/stochastic setting, which is a well-known issue in foundations of physics. 
In this context, more robust causality principles have been considered. In fact,
Bell's test experiments are motivated by 'local causality', of which there exist
several variants, all of them stemming from Reichenbach's 'principle of common 
cause'. Now, since PCA are nothing but simplifications of those spaces used in 
physics, this principle of common cause, if it is at all robust, should therefore 
lead to the axiomatization of PCA. We investigate this intriguing question (Sec- 
tion 5) and answer it in an unexpected fashion (Section 6). First we formalize 
the problem. 

2 Problem statement 

Definition 1 (Configurations) A finite (unbounded) configuration $c$ over a
finite alphabet $\Sigma$ is a function $c : \mathbb{Z} \to \Sigma$, with
$i \mapsto c(i) = c_i$, such that there exists a (possibly empty) finite set
$I$ satisfying $i \notin I \Rightarrow c_i = q$, where $q$ is a distinguished
quiescent state of $\Sigma$. We denote by $\mathcal{C}$ this set of finite
configurations; it is countable.

Countability is not crucial to the arguments developed in this paper, but it
certainly makes the following definitions much easier:

Definition 2 (Random variables, states) Random variables denoted by $X$
range over $\mathcal{C}$, i.e. entire configurations. Random variables
denoted by $X_i$ range over $\Sigma$, i.e. cells. Random variables denoted by
$X_I$ range over $I \to \Sigma$, i.e. sets of cells. Random variables
corresponding to time $t$ are denoted $X^t$, $X^t_I$. The state of the random
variable $Y$, denoted $\rho_Y$, is the function from $range(Y)$ to $[0, 1]$
such that $\rho_Y(y) = \Pr(Y = y)$, with $\Pr(Y = y)$ the probability mass
function of $Y$ at $y$, i.e. it denotes the law of distribution. For
convenience, in the particular case when $Y$ is fully determined on $y$, i.e.
such that $\rho_Y(x) = \delta_{xy}$, the state $\rho_Y$ will be written $y$.
Moreover, for convenience $\rho_{X^t_I}$ will be denoted $\rho^t_I$, and
referred to as the state of cells $I$ at time $t$. Whenever $I$ is finite, we
can make $\rho^t_I$ explicit as the vector
$(\rho^t_I(w))_{w \in (I \to \Sigma)}$ with the entries listed in the
lexicographic order.

Definition 3 (Stochastic maps) Consider $X$ and $X'$ two random variables
over the same range, and their corresponding states. We define the state
$p\rho_X + (1-p)\rho_{X'}$ so that
$(p\rho_X + (1-p)\rho_{X'})(y) = p\rho_X(y) + (1-p)\rho_{X'}(y)$. Consider $S$
a function from states $(range(X) \to [0, 1])$ to states
$(range(Y) \to [0, 1])$. Then $S$ is a stochastic map over the range of $X$ if
and only if it is linear, i.e.
$S(p\rho_X + (1-p)\rho_{X'}) = pS\rho_X + (1-p)S\rho_{X'}$. Whenever
$range(X)$ and $range(Y)$ are finite, we can make $S$ explicit as the
stochastic matrix $S = (S_x(y))_{y \in range(Y),\, x \in range(X)}$.
Notice that each column is the law of distribution $S_x$ for some state $x$,
hence its entries are in $[0, 1]$ and sum to one (left stochasticity). We
assume that the random variables over configurations
$(X^t)_{t \in \mathbb{N}}$ follow a stochastic process, i.e. that they form a
Markov chain
$\Pr(X^{n+1} = x^{n+1} \mid X^n = x^n) = \Pr(X^{n+1} = x^{n+1} \mid X^n = x^n, \ldots, X^0 = x^0)$
and obey the recurrence relation $\rho^t = G^t\rho$ for some stochastic map
$G$ over configurations.
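For finite ranges, Definition 3 reduces to ordinary matrix arithmetic, as the following minimal sketch illustrates. The 2x2 matrix and the two states are arbitrary examples chosen here, not objects from the paper; the matrix is stored as a list of rows so that `S[y][x]` plays the role of $S_x(y)$.

```python
from fractions import Fraction

def is_left_stochastic(S):
    """True when every column of S is a probability law (entries in [0,1], sum 1)."""
    rows, cols = len(S), len(S[0])
    return all(
        all(0 <= S[y][x] <= 1 for y in range(rows))
        and sum(S[y][x] for y in range(rows)) == 1
        for x in range(cols)
    )

def apply(S, rho):
    """Image state S rho, with (S rho)(y) = sum over x of S_x(y) * rho(x)."""
    return [sum(S[y][x] * rho[x] for x in range(len(rho))) for y in range(len(S))]

def mix(p, rho, rho2):
    """Convex combination p*rho + (1-p)*rho2 of two states."""
    return [p * a + (1 - p) * b for a, b in zip(rho, rho2)]

half = Fraction(1, 2)
S = [[half, 0],
     [half, 1]]            # column 0: fair coin; column 1: always output 1
rho = [1, 0]               # fully determined state
rho2 = [Fraction(1, 3), Fraction(2, 3)]
p = Fraction(1, 4)

lhs = apply(S, mix(p, rho, rho2))               # S(p rho + (1-p) rho2)
rhs = mix(p, apply(S, rho), apply(S, rho2))     # p S rho + (1-p) S rho2
```

Applying `apply` repeatedly to an initial state gives the sequence of states of the Markov chain, in line with the recurrence relation of the definition.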

Our problem is to determine what it means for a stochastic map over
configurations G to be causal, meaning that arbitrarily remote regions I and
J do not influence each other by any means. Then PCA will just be causal,
shift-invariant stochastic maps.



3 Continuity 

In the deterministic case CA were axiomatized by the celebrated
Curtis-Lyndon-Hedlund Theorem [13], which we now recall. First the space of
configurations is endowed with a metric:

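Definition 4 (Metric) The function
$d(\cdot,\cdot) : \mathcal{C} \times \mathcal{C} \to \mathbb{R}^+$ such that
$d(c, c') = 0$ if $c = c'$ and $d(c, c') = 1/2^k$ with
$k = \min\{i \in \mathbb{N} \mid c_{-i \ldots i} \neq c'_{-i \ldots i}\}$ is a
metric. For $c, c' \in \mathcal{C}$ and $\varepsilon > 0$ we have (with
$n = \lfloor -\log_2(\varepsilon) \rfloor$):
$d(c, c') < \varepsilon \Leftrightarrow c_{-n \ldots n} = c'_{-n \ldots n}$.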
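The metric and its window reformulation can be tried out on small configurations. In this sketch a finite configuration is represented as a Python dict from positions to states, with absent positions read as the quiescent state; that representation, the state name `"q"`, and the helper names are assumptions made for the example. The window size is taken as $n = \lfloor -\log_2(\varepsilon) \rfloor$, so that $d(c,c') < \varepsilon$ holds exactly when the two configurations agree on cells $-n \ldots n$.

```python
import math
from fractions import Fraction

Q = "q"  # quiescent state

def cell(c, i):
    """State of cell i, quiescent outside the stored support."""
    return c.get(i, Q)

def first_difference(c, c2):
    """Smallest k such that windows -k..k differ, or None when c == c2."""
    diffs = [abs(i) for i in set(c) | set(c2) if cell(c, i) != cell(c2, i)]
    return min(diffs) if diffs else None

def d(c, c2):
    """Distance 1/2^k, with k the radius of the first differing window."""
    k = first_difference(c, c2)
    return Fraction(0) if k is None else Fraction(1, 2 ** k)

def agree_on_window(c, c2, n):
    """True when c and c2 coincide on cells -n..n (vacuously true if n < 0)."""
    return all(cell(c, i) == cell(c2, i) for i in range(-n, n + 1))

def window_for(eps):
    """Window radius n = floor(-log2(eps)) matching the bound d < eps."""
    return math.floor(-math.log2(eps))

c = {0: 1, 3: 1}
c2 = {0: 1, 3: 0}          # first difference at distance 3 from the origin
```

For these two configurations `d(c, c2)` is 1/8, so they are closer than 1/4 (they agree on cells -2..2) but not closer than 1/8 (they differ within cells -3..3).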

Definition 5 ((Uniform) continuity) A function
$F : \mathcal{C} \to \mathcal{C}$ is continuous if and only if for all
$c \in \mathcal{C}$ and $\varepsilon > 0$, there exists $\eta > 0$ such that
for all $c' \in \mathcal{C}$:
$d(c, c') < \eta \Rightarrow d(F(c), F(c')) < \varepsilon$. A function
$F : \mathcal{C} \to \mathcal{C}$ is uniformly continuous if and only if for
all $\varepsilon > 0$, there exists $\eta > 0$ such that for all
$c, c' \in \mathcal{C}$:
$d(c, c') < \eta \Rightarrow d(F(c), F(c')) < \varepsilon$.

In other words a function $F : \mathcal{C} \to \mathcal{C}$ is uniformly
continuous if and only if for all $n \in \mathbb{N}$, there exists
$m \in \mathbb{N}$ such that for all $c, c' \in \mathcal{C}$,
$c_{-m \ldots m} = c'_{-m \ldots m}$ implies
$F(c)_{-n \ldots n} = F(c')_{-n \ldots n}$. Notice that rephrased in this
manner, uniform continuity is a synonym for non-signalling, i.e. the fact
that information does not propagate faster than a fixed speed bound.
Continuity on the other hand expresses a somewhat strange form of relaxed
non-signalling, where information does not propagate faster than a certain
speed bound, but this speed bound depends upon the input. However, it so
happens that the two notions coincide for compact spaces. Moreover, classical
CA are easily defined upon infinite configurations
$\mathcal{C}_\infty : \mathbb{Z} \to \Sigma$, for which the same
$d(\cdot,\cdot)$ happens to be a compact metric. This yields:

Theorem 1 (Curtis, Lyndon, Hedlund)

A function $F : \mathcal{C}_\infty \to \mathcal{C}_\infty$ is continuous and
shift-invariant if and only if it is the global evolution of a cellular
automaton over $\mathcal{C}_\infty$.
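The operational side of the theorem, a global evolution obtained by applying one local rule everywhere, can be sketched on finite configurations as follows. The dict representation, the radius parameter `r`, and the choice of rule 90 (each cell becomes the XOR of its two neighbours) are assumptions made for illustration. The checks at the end exercise shift-invariance and the window form of non-signalling: agreement of the inputs on cells $-(n+r) \ldots n+r$ forces agreement of the outputs on cells $-n \ldots n$.

```python
Q = 0  # quiescent state; the local rule must map an all-quiescent neighbourhood to Q

def step(c, rule, r=1):
    """One step of the CA of radius r on a finite configuration.

    c maps positions to non-quiescent states; absent positions are quiescent.
    """
    if not c:
        return {}
    out = {}
    # Only cells within distance r of the support can become non-quiescent.
    for i in range(min(c) - r, max(c) + r + 1):
        s = rule(tuple(c.get(j, Q) for j in range(i - r, i + r + 1)))
        if s != Q:
            out[i] = s
    return out

def shift(c, k):
    """Translate a configuration by k cells."""
    return {i + k: s for i, s in c.items()}

def rule90(neighbourhood):
    left, _, right = neighbourhood
    return left ^ right

def window(c, n):
    """Restriction of c to cells -n..n, as a tuple."""
    return tuple(c.get(i, Q) for i in range(-n, n + 1))

c = {0: 1, 2: 1, 5: 1}
c2 = {0: 1, 2: 1, 7: 1}    # agrees with c on cells -3..3
shift_invariant = step(shift(c, 5), rule90) == shift(step(c, rule90), 5)
local = window(step(c, rule90), 2) == window(step(c2, rule90), 2)
```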

In other words this theorem just states that CA are exactly the
non-signalling, shift-invariant functions. But instead of having to call them
'non-signalling' (a.k.a. 'uniformly continuous'), it only needs to call them
'continuous', due to the peculiarities of $\mathcal{C}_\infty$. However for
the finite, yet unbounded configurations $\mathcal{C}$, the metric
$d(\cdot,\cdot)$ is not compact. In this case we must assume the stronger,
uniform continuity for the theorem to work. Generally speaking, it is rather
difficult to find a relevant compact metric for probabilistic extensions of
$\mathcal{C}_\infty$, and not worth the effort for the sole purpose of
axiomatizing PCA. Instead, let us directly assume the probabilistic
counterpart of non-signalling (a.k.a. uniform continuity for some extended
metric):

Definition 6 (Non-signalling) A stochastic map over configurations G is
non-signalling if and only if for any two states $\rho$, $\rho'$ over
configurations, and for any cell $i$, we have:

$\rho_{i-1,i} = \rho'_{i-1,i} \Rightarrow (G\rho)_i = (G\rho')_i$.

For example, Parity (see Figure 1) is non-signalling by construction. Is it
reasonable to say, à la Hedlund, that PCA are the non-signalling,
shift-invariant stochastic maps? Surprisingly, this is not the case. Imagine
that Alice in Paris (cell 0) tosses a fair coin, whilst Bob in New York (cell
n+1) tosses another. Imagine that the two coins are magically correlated,
i.e. it so happens that they always yield the same result. Such a stochastic
map is clearly not implementable by local mechanisms: we definitely need some
amount of (prior) communication between Alice and Bob in order to make it
happen. Yet it can be, as in MagicCoins (see Figure 2), perfectly
non-signalling. While the setup cannot be used to communicate 'information'
between distant places, it can be used to create spontaneous 'correlations'
between them. We must forbid this from happening. In this respect, assuming
only (non-uniform) continuity is the wrong direction to take.

Fig. 2. MagicCoins. Inputs are ignored. Hence output random variables
$(X_i)_{i=1 \ldots n}$ have the property that for any output $x$, the state
(a.k.a. the law of distribution) $\rho_x$ is independent of the inputs. Yet
output random variables $X_1$ and $X_n$ are both uniformly distributed and
maximally correlated. As such MagicCoins is a localized stochastic map, but
can be modified into a translation-invariant stochastic map over entire
configurations using the method described for Parity. MagicCoins is
non-signalling, but it must not be considered a valid PCA: it is not
non-correlating.
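The MagicCoins behaviour can be written down as an explicit output law and inspected. This is a sketch under assumptions: the figure is not reproduced here, so the intermediate cells are simply set to 0, and the function name and signature are hypothetical. What matters is that the law ignores its input, that the first and last outputs are each uniform, and that their joint law is not the product of its marginals.

```python
from fractions import Fraction

def magic_coins_law(n, inputs):
    """Output law over n cells: the first and last cells share one fair coin.

    `inputs` is accepted but never read, which makes the map non-signalling.
    Cells in between are set to 0 (an arbitrary choice for this sketch).
    """
    half = Fraction(1, 2)
    middle = (0,) * (n - 2)
    return {(0,) + middle + (0,): half, (1,) + middle + (1,): half}

def marginal(law, cells):
    """Marginal law of the listed output cells (0-based positions)."""
    m = {}
    for out, p in law.items():
        key = tuple(out[i] for i in cells)
        m[key] = m.get(key, 0) + p
    return m

n = 5
law = magic_coins_law(n, (0,) * n)
first, last = marginal(law, [0]), marginal(law, [n - 1])
joint = marginal(law, [0, n - 1])
is_product = all(
    joint.get((a, b), 0) == first[(a,)] * last[(b,)]
    for a in (0, 1) for b in (0, 1)
)
```

Here `is_product` evaluates to `False`: the two distant cells are correlated although no input can influence any output.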



4 Avoiding spontaneous correlations 

From the previous discussion we are obliged to conclude that the formalization of
a robust notion of the causality of a dynamics is indeed a non-trivial matter in a 
probabilistic setting. From the MagicCoins example, we draw the conclusion that 
such a notion must forbid the creation of spontaneous correlations between dis- 
tant places. The following definition clarifies what is meant by (non-)correlation 
between subsystems. 

Definition 7 (Independence, tensor, trace) Let $I$ and $J$ be disjoint.
Stating that $X_I$ and $X_J$ are independent is equivalent to stating that
for any $uv$ in $(I \cup J \to \Sigma)$,
$\rho_{I \cup J}(uv) = \rho_I(u)\rho_J(v)$, with $u$ (resp. $v$) the
restriction of $uv$ to $I$ (resp. $J$). In this case we write
$\rho_{I \cup J} = \rho_I \otimes \rho_J$. This notation is justified because
whenever $I$ and $J$ are finite, we have that the law of distribution
$(\rho_{I \cup J}(uv))$ equals $(\rho_I(u)\rho_J(v))$ which is the definition
of $(\rho_I(u)) \otimes (\rho_J(v))$, where $\otimes$ is the Kronecker/tensor
product. Whether or not $X_I$ and $X_J$ are independent, we can always
recover $\rho_I$ as the marginal of $\rho_{I \cup J}$ by averaging over every
possible $v$, an operation which we denote $Tr_J$ and call the
trace-out/marginal-out operation. Namely we have that
$\rho_I = Tr_J(\rho_{I \cup J})$, with
$Tr_J(\rho_{I \cup J})(u) = \sum_v \rho_{I \cup J}(uv)$.
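For finite $I$ and $J$ the tensor and trace operations of Definition 7 are short list manipulations. In the sketch below a joint state is a flat list indexed by `u * dim_J + v`, which is the lexicographic order of the pairs $uv$; the function names and the sample states are assumptions made for the example.

```python
from fractions import Fraction

def tensor(rho_I, rho_J):
    """Kronecker product of two state vectors, entry (u, v) at index u*dim_J + v."""
    return [a * b for a in rho_I for b in rho_J]

def trace_out_J(rho_IJ, dim_I, dim_J):
    """Tr_J: marginal on I, obtained by summing over every v."""
    return [sum(rho_IJ[u * dim_J + v] for v in range(dim_J)) for u in range(dim_I)]

def trace_out_I(rho_IJ, dim_I, dim_J):
    """Tr_I: marginal on J, obtained by summing over every u."""
    return [sum(rho_IJ[u * dim_J + v] for u in range(dim_I)) for v in range(dim_J)]

def independent(rho_IJ, dim_I, dim_J):
    """True when the joint state equals the tensor product of its marginals."""
    return rho_IJ == tensor(
        trace_out_J(rho_IJ, dim_I, dim_J), trace_out_I(rho_IJ, dim_I, dim_J)
    )

rho_I = [Fraction(1, 3), Fraction(2, 3)]
rho_J = [Fraction(1, 4), Fraction(3, 4)]
product_state = tensor(rho_I, rho_J)
half = Fraction(1, 2)
correlated_state = [half, 0, 0, half]   # two bits that always agree
```

The marginals of `correlated_state` are both uniform, yet `independent` rejects it, which is the situation that the MagicCoins example exploits.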

A way to forbid spontaneous correlations is to require that, after one time
step of the global evolution G applied upon any initially fully determined
configuration $c$, and for any two distant regions
$I = -\infty \ldots x-1$ and $J = x+1 \ldots \infty$, the output
$\rho = Gc$ be such that $\rho_{I \cup J} = \rho_I \otimes \rho_J$. In other
words remote regions remain independent. This formulation is somewhat
cumbersome, because it seeks to capture in the vocabulary of 'states' a
property which really belongs to their 'dynamics'. The following definition
clarifies what it means for a stochastic map to be localized upon a
subsystem.

Definition 8 (Extension, localization, tensor) Let $I$ and $J$ be disjoint.
Consider a stochastic map $S$ over $I$, i.e. from states
$((I \to \Sigma) \to [0, 1])$ to themselves. Then $S$ can be trivially
extended into a stochastic map $S \otimes Id$ over $I \cup J$, i.e. from
states $((I \cup J \to \Sigma) \to [0, 1])$ to themselves, and such that
$(S \otimes Id)(\rho_I \otimes \rho_J) = (S\rho_I) \otimes \rho_J$. Whenever
$I$ and $J$ are finite, this extends the stochastic matrix $(S_u(u'))$ into
$(S_u(u')\delta_{vv'})$ of $(S \otimes Id)$. We say that some stochastic map
over $I \cup J$ is localized upon $I$ precisely if it arises in such a way,
i.e. as the trivial extension of some stochastic map $S$ over $I$. Moreover
if $T$ is over $J$, then
$(S \otimes T) = (Id \otimes T)(S \otimes Id) = (S \otimes Id)(Id \otimes T)$.
This notation is justified because whenever $I$ and $J$ are finite, we have
that the resulting stochastic matrix is $(S_u(u')T_v(v'))$ which equals
$(S_u(u')) \otimes (T_v(v'))$, the Kronecker/tensor product of both
stochastic matrices.
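The identities stated in Definition 8 can be checked numerically for small matrices. The following sketch uses plain nested lists (rows of the stochastic matrix) and two arbitrary example maps on one bit each; all names and sample values are assumptions for illustration.

```python
from fractions import Fraction

def kron(S, T):
    """Kronecker product of two matrices given as lists of rows."""
    return [[s * t for s in s_row for t in t_row] for s_row in S for t_row in T]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def matmul(A, B):
    """Matrix product A B (apply B first, then A)."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]

def apply(M, rho):
    """Image state M rho."""
    return [sum(M[y][x] * rho[x] for x in range(len(rho))) for y in range(len(M))]

def tensor(rho_I, rho_J):
    return [a * b for a in rho_I for b in rho_J]

half = Fraction(1, 2)
S = [[half, 0], [half, 1]]      # acts on I: randomizes a 0, keeps a 1
T = [[0, 1], [1, 0]]            # acts on J: flips the bit
Id = identity(2)
rho_I = [Fraction(1, 3), Fraction(2, 3)]
rho_J = [Fraction(1, 4), Fraction(3, 4)]

# (S ⊗ Id)(rho_I ⊗ rho_J) = (S rho_I) ⊗ rho_J
extension_ok = apply(kron(S, Id), tensor(rho_I, rho_J)) == tensor(apply(S, rho_I), rho_J)
# S ⊗ T = (Id ⊗ T)(S ⊗ Id) = (S ⊗ Id)(Id ⊗ T)
commute_ok = (
    kron(S, T) == matmul(kron(Id, T), kron(S, Id)) == matmul(kron(S, Id), kron(Id, T))
)
```

Both checks succeed: a map localized upon $I$ leaves the state of $J$ untouched, and maps localized upon disjoint regions commute.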

We can then forbid spontaneous correlations directly in terms of dynamics: 

Definition 9 (non-correlation) A stochastic map over configurations G is
non-correlating if and only if for any output cell $x$, there exist
stochastic maps $A$, $B$ acting over input cells $-\infty \ldots x-1$ and
$x \ldots +\infty$ respectively, such that:
$Tr_x \circ G = A \otimes B$.

For example, Parity (see Figure 1) is non-correlating by construction. Now,
is it reasonable to say that PCA are the non-correlating, shift-invariant
stochastic maps? Amazingly, this is not the case. Indeed, consider a small
variation of Parity, which we call GenNLBox and define in Figure 3. Such a
stochastic map is clearly not implementable by local mechanisms: we
definitely need some amount of communication between Alice and Bob in order
to make it happen. It suffices to notice that GenNLBox is in fact a
generalization of the 'non-local box', which we recover for n = 2, see
Figure 3(b). But then, the non-local box owes its name precisely to the fact
that it is not implementable by local mechanisms. Formal proofs of this
assertion can be found in the literature [4, 21] and rely on the fact that
the NLBox (maximally) violates the Bell inequalities [5, 30]. Yet GenNLBox
(see Figure 3(a)) is perfectly non-correlating. Hence, whilst the set-up
cannot be used to communicate 'information' between distant places (it is
non-signalling), nor to create spontaneous 'correlations' between distant
places (it is non-correlating), we still must forbid it from happening!
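The claims about the NLBox and its n-party generalization can be verified by enumeration. The sketch below takes the GenNLBox output law to be uniform over the n-bit strings whose parity equals $ab$, where $a$ and $b$ are the two input bits; this is the law forced by the two stated properties (global parity $ab$, every strict subset of outputs uniform). The function names are hypothetical. The CHSH combination used is $E(0,0) + E(0,1) + E(1,0) - E(1,1)$, whose absolute value is at most 2 for any box built from a shared classical resource and at most $2\sqrt{2}$ with quantum resources.

```python
from fractions import Fraction
from itertools import combinations, product

def gen_nl_box_law(n, a, b):
    """Uniform law over the n-bit outputs whose global parity equals a*b."""
    outs = [x for x in product((0, 1), repeat=n) if sum(x) % 2 == a * b]
    return {x: Fraction(1, len(outs)) for x in outs}

def marginal(law, cells):
    """Marginal law of the listed output cells (0-based positions)."""
    m = {}
    for x, p in law.items():
        key = tuple(x[i] for i in cells)
        m[key] = m.get(key, 0) + p
    return m

def strict_subsets_uniform(law, n):
    """True when every non-empty strict subset of outputs is uniformly distributed."""
    for k in range(1, n):
        for cells in combinations(range(n), k):
            m = marginal(law, cells)
            if len(m) != 2 ** k or any(p != Fraction(1, 2 ** k) for p in m.values()):
                return False
    return True

def chsh(law_for):
    """CHSH value of a two-output box, given a function (a, b) -> output law."""
    def E(a, b):
        # Correlator: +1 when the two outputs agree, -1 when they differ.
        return sum(p * (1 if x[0] == x[1] else -1) for x, p in law_for(a, b).items())
    return E(0, 0) + E(0, 1) + E(1, 0) - E(1, 1)

nl_box_chsh = chsh(lambda a, b: gen_nl_box_law(2, a, b))
```

For n = 2 the CHSH value is 4, the algebraic maximum, even though every strict subset of outputs is uniform for every choice of inputs.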

5 Common cause principle 

From the GenNLBox example of the previous Section, we are obliged to conclude
that a robust notion of the causality of a dynamics in a probabilistic
setting cannot be phrased just in terms of a non-signalling or a
non-correlation property. Yet, this example has a virtue: it points towards
similar questions raised in theoretical physics. This suggests looking at how
those issues have been addressed so far in theoretical physics.

Indeed, the NLBox is generally regarded as 'unphysical' because it does not
comply with Bell's [5] 'local causality', meaning that there is no prior
shared resource that one may distribute between Alice and Bob (the outputs of
the middle box of Figure 3(b)), that will then allow them to perform the
required task separately. Distributing quantum resources instead of classical
resources (imagining that the outputs of the middle box can now be entangled
quantum states) will not fix the problem: yes it does assist Alice and Bob in
performing the required task separately [2]; but only approximately so [28].

Fig. 3. (a) GenNLBox. Input cells $1 \ldots n-1$ are ignored. Output random
variables $(X_i)_{i=1 \ldots n}$ have the property that for any strict subset
$I$ of $1 \ldots n$, $X_I$ is uniformly distributed, yet the global parity is
always $ab$. As such GenNLBox is a localized stochastic map, but can be
modified into a translation-invariant stochastic map over entire
configurations using the method described for Parity. GenNLBox is both
non-signalling and non-correlating, but it must not be considered a valid
PCA: it is not V-causal. (b) NLBox. The output random variables $X_1$ and
$X_2$ are uniformly distributed, but their parity is always equal to $ab$. No
circuit of the displayed dotted form can meet these specifications [5, 30].

Bell's 'local causality' [5], 'Screening-off' [11], 'Stochastic Einstein
locality' [14, 8], are all similar conditions, which stem from improvements
upon Reichenbach's 'Principle of Common Cause' [16, 22], as was nicely
reviewed in [15]. The common cause principle can be summarized as follows:
"Two events can be correlated at a certain time if and only if, conditional
upon the events of their mutual past, they become independent". In the
context of this paper, this gives:

Definition 10 (Screening-off) A stochastic map over configurations G obeys
the screening-off condition if and only if for any input cell $i$ with values
in $\Sigma$, there exist stochastic maps $(A_x, B_x)_{x \in \Sigma}$ acting
over input cells $-\infty \ldots i-1$ and $i+1 \ldots +\infty$ respectively,
such that for any $\rho$:

$\rho_i = x \Rightarrow G\rho = (A_x \otimes B_x)\rho$.

Here input $i$ is said to screen-off G.

This screening-off condition is physically motivated and does not suffer the
problems that the non-signalling and non-correlation conditions had.
Unfortunately however, it suffers most of the problems of the original,
Standard-PCA definition: it is again incomplete and non-composable. For
example, Parity does not obey the screening-off condition. Yet, Parity is a
natural PCA, clearly implementable by local mechanisms as was shown in
Figure 1. But the prior shared resource which is necessary in order to
separate Alice from Bob is not present in the input cells, rather it is
generated on-the-fly within one time-step of G, cf. the circle-marked boxes.
In other words, the reason why the screening-off condition rejects Parity is
that the condition is too stringent in demanding that screening-off events be
made explicit in the inputs $x$. A more relaxed condition would be to require
that $x$ may be completed so as to then screen-off G.

Definition 11 (Screening-off-completable) A stochastic map over
configurations G is screening-off-completable, or simply V-causal, if and
only if for any input cell $i$, there exist stochastic maps $A$, $C$, $B$,
$L$, $R$, with $A$, $C$ and $B$ taking as input cells
$-\infty \ldots i-1$, $i$ and $i+1 \ldots +\infty$ respectively, such that:

$G = (L \otimes R)(A \otimes C \otimes B)$.

Here $C$ is said to screen-off G at $i$. See Figure 4(a).

Fig. 4. (a) G is V-causal if it can be put in this form, for each $i$. (b) A
strengthened condition: VV-causality.



Again, is it reasonable to say that PCA are the screening-off-completable,
shift-invariant stochastic maps? Again, this is not the case. Indeed,
consider a small variation of Parity, which we call VBox and define in
Figure 5. Such a stochastic map is not implementable by local mechanisms: we
need some amount of communication between Alice and Bob in order to make it
happen. Yet VBox is perfectly V-causal, as shown in Figure 5. Hence, whilst
the set-up is screening-off-completable, we still must forbid it: our
condition is again too weak. A natural family of conditions to consider is
VV-causality (as defined in Figure 4(b)), $V^3$-causality etc. This route
seems a dead end; we believe that the VBox is the $k = 1$ instance of the
following more general result:

Conjecture 1 For all $k$, there exists a $V^k$Box which is $V^k$-causal but
not $V^{k+1}$-causal.



6 Concluding definition 

Fig. 5. VBox. Input cells $1 \ldots n-1$ are ignored. Output random variables
$(X_i)_{i=1 \ldots n}$ have the property that for any strict subset $I$ of
$1 \ldots n$, $X_I$ is uniformly distributed, yet their global parity is,
with equal probability, $ab$ or 0. As such VBox is a localized stochastic
map, but can be modified into a translation-invariant stochastic map over
entire configurations using the method described for Parity. VBox is
non-signalling, non-correlating and V-causal, but it must not be considered a
valid PCA: it is not VV-causal. The VBox can be seen as a chain of two
GenNLBoxes; chaining more of them yields a $V^k$Box.

Our best definition. We have examined several well-motivated causality
principles (non-signalling, non-correlation, V-causality) and shown, through
a series of surprising counter-examples, that stochastic maps with these
properties are not necessarily implementable by local mechanisms. In the
limit when $k$ goes to infinity, $V^k$-causality turns into the following
definition (assuming shift-invariance):

Definition 12 (Probabilistic Cellular Automata) A stochastic map over
configurations G is a PCA if and only if

$G = (\bigotimes D)(\bigotimes C)$

where the $i$-th stochastic matrix $C$ has input $i$ and outputs $l_i, r_i$,
and the $i$-th stochastic matrix $D$ has inputs $r_{i-1}, l_i$ and output
$i$, see Figure 6.
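The two-layer form of Definition 12 can be simulated directly by sampling. The sketch below makes several assumptions that are not in the definition: the cells sit on a ring (so that $r_{i-1}$ for the first cell comes from the last one), and the stochastic matrices $C$ and $D$ are given as sampling functions taking a random number generator. The three gates are hypothetical examples; with `coin_gate` and `xor_gate` the step reproduces a Parity-like behaviour, since each cell outputs the XOR of two neighbouring coins.

```python
import random

def pca_step(config, C, D, rng):
    """Sample one step of G = (⊗D)(⊗C) on a ring of cells.

    C: (state, rng) -> (l, r)          first layer, one gate per cell
    D: (r_prev, l, rng) -> state       second layer, one gate per cell
    """
    # Layer 1: cell i emits a left wire l_i and a right wire r_i.
    wires = [C(s, rng) for s in config]
    # Layer 2: cell i reads r_{i-1} (from its left neighbour) and its own l_i.
    return [D(wires[i - 1][1], wires[i][0], rng) for i in range(len(config))]

def copy_gate(s, rng):
    # Deterministic C: duplicate the cell state onto both wires.
    return (s, s)

def coin_gate(s, rng):
    # Randomized C: ignore the input and put one fair coin on both wires.
    c = rng.randint(0, 1)
    return (c, c)

def xor_gate(r_prev, l, rng):
    # Deterministic D: addition modulo two.
    return r_prev ^ l

deterministic = pca_step([1, 0, 1, 1], copy_gate, xor_gate, random.Random(0))
sample = pca_step([0] * 6, coin_gate, xor_gate, random.Random(1))
```

With deterministic gates the step is an ordinary CA; with `coin_gate` every sample has even global parity, because each coin is XORed into exactly two neighbouring outputs.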



Fig. 6. (a) An operational definition of PCA. Time flows upwards. Wires stand
for cells. The first two layers of boxes describe one application of a PCA
$G = (\bigotimes D)(\bigotimes C)$. (b) Its compositionality. The second two
layers describe one application of a PCA
$G' = (\bigotimes D')(\bigotimes C')$. The resulting composition is a PCA
$G'G = (\bigotimes (D' \otimes D')C'D)(\bigotimes C'D(C \otimes C))$ over the
hatched supercells of size 2.



This definition has several advantages over Standard-PCA. First, it is
complete, because it captures exactly what is meant (up to grouping) by a
shift-invariant stochastic map implementable by local mechanisms. Second, it
is composable, as shown in Figure 6. Third, it is less ad-hoc. Indeed, we
have mentioned in the introduction that much mathematics is done by having
axiomatic definitions and operational descriptions coincide (usually by means
of structure versus characterization theorems). It could be said that the
same effect has been achieved in this paper, but in a different manner.
Indeed, in this paper we have considered the natural candidate axiomatic
definitions of PCA, whose limit is the natural operational description of
PCA, and we have discarded each of them by means of counter-examples. In this
sense, we have pushed this family of candidate axiomatic definitions to its
limit, i.e. as far as to coincide with the operational description.

Discussion. Still, we cannot really claim to have reached an axiomatization,
nor a characterization of PCA: Definition 12 can be presented as just the
composition of two Standard-PCA; or even more simply as a Standard-PCA with
its two phases reversed: first a model of noise gets applied (a stochastic
matrix is applied homogeneously on each individual cell), and second, a
classical CA is applied. If anything, this paper has shown that it is rather
unlikely that an axiomatization can be achieved. The authors are not in the
habit of publishing negative results. However, the question of a
characterization of PCA à la Hedlund has been a long-standing issue: we
suspect that many researchers have attempted to obtain such a result in vain.
At some point, it becomes just as important to discard the possibility of a
Theorem as to establish a new one. Moreover, advances on the issue of
causality principles [5, 8, 14, 16, 22] have mainly arisen from the discovery
of counter-examples (see [15] and the more recent [10]), to which this paper
adds a number. An impressive amount of literature [21, 3] focusses on the
NLBox counter-example (as regards its comparison with quantum information
processing, but also its own information processing power) and raises the
question of n-party extensions: the GenNLBox, the VBox, and the $V^k$Boxes
could prove useful in this respect.

Finally, let us point out that the observation that the operational
description of PCA seems to admit no other axiomatization than itself is in
sharp contrast with the classical case (see Hedlund's theorem, Section 1),
the reversible case (see [17]) and the quantum case (see [1]), for which the
non-signalling condition alone suffices to entail localizability, i.e.
implementation by local mechanisms. It is the moment when probabilities come
into the picture (whether via stochastic maps or via quantum operations; this
paper cancels out all efforts to generalize the result of [1] to an open
systems setting) that the non-signalling condition becomes too weak.
Moreover, we have seen that replacement principles based on a 'common cause'
are hardly satisfactory. In philosophy of science it has been argued that the
very notion of 'physical law' requires an underlying notion of causality,
such as non-signalling. But probabilistic/stochastic theories seem to require
a very explicit notion of causality, such as localizability.